# Speech Recognition

Ultravox v0.5 Llama 3.2 1B GGUF

Ultravox v0.5 is an audio-to-text model built on the Llama-3.2-1B architecture and focused on efficient speech transcription.

Speech Recognition

Hubert Base Librispeech Demo Colab

A speech recognition model fine-tuned from facebook/hubert-large-ls960-ft on the LibriSpeech dataset.

Speech Recognition

Wav2vec2 Base Librispeech Demo Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base on the LibriSpeech dataset, achieving a word error rate (WER) of 0.3174 on the evaluation set.

Speech Recognition

Wav2vec Checkpoints

A speech processing model fine-tuned from facebook/wav2vec2-base, achieving 99.48% accuracy on the evaluation set.

Speech Recognition

Zeyadd-Mostaffa

Deepfake Audio Detection

A speech processing model further fine-tuned from wav2vec2-base-finetuned, achieving 98.82% accuracy on the evaluation set.

Speech Recognition

Deepfake Audio Detection

A speech processing model fine-tuned from wav2vec2-base-finetuned, achieving 98.82% accuracy on the evaluation set.

Speech Recognition

Wav2vec2 Phoneme

A speech recognition model fine-tuned from facebook/wav2vec2-large-xlsr-53 for phoneme recognition.

Speech Recognition

Wav2vec2 Base Finetuned

A speech processing model fine-tuned from facebook/wav2vec2-base, achieving 99.97% accuracy on the evaluation set.

Speech Recognition

Wav2vec2 Base Finetuned

A speech processing model fine-tuned from facebook/wav2vec2-base, achieving 99.97% accuracy on the evaluation set.

Speech Recognition

Wav2vec2 Base Finetuned Ks

An audio classification model fine-tuned from wav2vec2-base on an audio folder dataset, achieving 99.82% accuracy on the validation set.

Audio Classification

Whisper Small Dialect Classifier Cross

A dialect classifier based on the whisper-small architecture, designed to classify speech input by dialect.

Audio Classification

Bsc Ai Thesis Torgo Model 1

A speech processing model fine-tuned from facebook/wav2vec2-base, reported to perform well on the evaluation set.

Speech Recognition

Neunit Ks Kangyuan0601

An audio classification model fine-tuned from facebook/wav2vec2-base on the SUPERB dataset, achieving 99.87% accuracy on the evaluation set.

Audio Classification

Wav2vec2 Base Finetuned Amd

A version of facebook/wav2vec2-base fine-tuned on an unknown dataset, primarily used for speech recognition tasks, achieving 84.55% accuracy on the evaluation set.

Speech Recognition

Audio Class Finetuned

An audio classification model fine-tuned from facebook/wav2vec2-base on the SUPERB dataset, achieving an accuracy of 0.6578 (65.78%) on the evaluation set.

Audio Classification

Wav2vec2 Base Finetuned Ks

A speech recognition model fine-tuned from facebook/wav2vec2-base on the SUPERB dataset, achieving 98.34% accuracy.

Speech Recognition

Whisper Small ISSAI KSC 335RS V2

A small speech recognition model based on the Whisper architecture, suitable for domain-specific speech-to-text tasks.

Speech Recognition

A speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m, primarily used for English speech-to-text tasks.

Speech Recognition

Wav2vec2 Base Finetuned Ks

A speech recognition model fine-tuned from facebook/wav2vec2-base on the SUPERB dataset, reported to perform well on keyword spotting tasks.

Speech Recognition

Wav2vec2 Base Finetuned Ie

A fine-tuned version of facebook/wav2vec2-base for specific tasks.

Speech Recognition

Wav2vec2 Base Finetuned Ks

A speech recognition model fine-tuned from facebook/wav2vec2-base, achieving 87.27% accuracy on the evaluation set.

Speech Recognition

Wav2vec2 Base Timit Demo Google Colab

This model is a fine-tuned version of facebook/wav2vec2-base, primarily used for speech recognition tasks.

Speech Recognition

Wav2vec2 Base Timit Demo Google Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset.

Speech Recognition

Wav2vec2 Base Timit Demo Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset, reported to achieve a low word error rate (WER).

Speech Recognition

Wav2vec2 Base Timit Demo Google Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset, specializing in English speech-to-text tasks.

Speech Recognition

Wav2vec2 Base Timit Demo Google Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset, primarily used for English speech-to-text tasks.

Speech Recognition

Wav2vec2 Base Ft Cv3 V3

A speech recognition model fine-tuned from facebook/wav2vec2-base on the Common Voice 3.0 English dataset, achieving a word error rate (WER) of 0.247 on the test set.

Speech Recognition

Wav2vec Trained

A speech recognition model fine-tuned from facebook/wav2vec2-base, achieving a word error rate (WER) of 0.1042 on the evaluation set.

Speech Recognition

A speech recognition model fine-tuned from facebook/wav2vec2-base-960h.

Speech Recognition

A speech recognition model fine-tuned from facebook/wav2vec2-base-960h, with a word error rate (WER) of 1.0 on the evaluation set.

Speech Recognition

A speech recognition model fine-tuned from facebook/wav2vec2-base, achieving a word error rate (WER) of 0.3355 on the evaluation set.

Speech Recognition

Wav2vec2 Base Dataset Asr Demo Colab

A speech recognition model fine-tuned from distilhubert on the SUPERB dataset, primarily used for automatic speech recognition (ASR) tasks.

Speech Recognition

Test Demo Colab

This is an automatically generated test model, primarily for demonstration and experimental purposes.

Large Language Model

Wav2vec2 Base Timit Demo Google Colab

A speech recognition model fine-tuned from facebook/wav2vec2-base on the TIMIT dataset, achieving a word error rate (WER) of 0.3384 on the evaluation set.

Speech Recognition

Wav2vec2 Keyword Spotting Int8

A keyword spotting model based on the wav2vec2 architecture, quantized with Optimum OpenVINO.

Speech Recognition

Wac2vec Lllfantomlll

A speech recognition model fine-tuned from facebook/wav2vec2-base, achieving a word error rate (WER) of 0.3417 on the evaluation set.

Speech Recognition

Wav2vec2 Base Vios Commonvoice 1

A speech recognition model fine-tuned from facebook/wav2vec2-xls-r-300m on the Common Voice dataset for automatic speech recognition (ASR) tasks.

Speech Recognition

Wav2vec2 Base Timit Demo Colab53

A speech recognition model fine-tuned from facebook/wav2vec2-base, suitable for the TIMIT dataset.

Speech Recognition

Wav2vec2 Final 1 Lm 4

A speech recognition model fine-tuned from facebook/wav2vec2-base, achieving a word error rate (WER) of 0.4499 on the evaluation set.

Speech Recognition

Wav2vec2 Final 1 Lm 3

A speech recognition model fine-tuned from facebook/wav2vec2-base, achieving a word error rate (WER) of 0.4499 on the evaluation set, which drops to 0.126 when a 4-gram language model is used.

Speech Recognition
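
Several entries in this list report a word error rate (WER) as a fraction, for example 0.3174 or 1.0. As a rough illustration of what that number measures, the sketch below computes WER as the word-level edit distance between a reference transcript and a model hypothesis, divided by the number of reference words. The function name `word_error_rate` and the sample sentences are illustrative assumptions and do not come from any listed model card; the listed models may have been scored with a different tool or with different text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # aligning i reference words to nothing costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, used only to exercise the function.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Under this definition, a WER of 1.0 means the number of word errors equals the number of reference words, and values above 1.0 are possible when the hypothesis contains many inserted words.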
